9.2.1 生态位理论基础
## 理论基础与说明:
研究生态位理论主要从三个层面展开:
一个是从物种组成层面,一个是从系统发育角度,另外一个是从群落构造研究的角度出发;
## 最早期的生态位研究是利用物种生态幅来反映种间或者群落/群从间生态位差异;但从单一维度并不解释复合现象,需要从更多维度来构建约束分析探究物种的生长限制和环境约束。而且当时主要受限于研究和统计环境环境背景无法只能从局域尺度展开。
## 中期的生态位研究分为两个主要发展方向:
第一是生态位理论伴随着群落构建理论的构建,特别是连续体理论和顶级群落格局相关理论的提出,促进了生态位理论在群落中构建中补充和完善。此时关于物种在局域环境尺度上考虑到更多更复杂的环境背景和生态交互作用对于种群功能生态位的作用和影响;
第二是基于MacArthur的消费者-资源利用模型,将生物体的功能性状和种群水平地理预测或者响应环境变化工作联系起来进行综合统计学分析。此方向的典型工作是将植物的机理生化特征与非生物环境因子相结合来探究生物地理分布趋势、生态位保守性与进化特征。与早期的建模类似,都是自下而上的研究范式;
中期的建模研究同样面临取样困境,植物的生理性状难以大规模取样。现在新开发的生物功能特征数据库,也不是很方便应用;
## 现代的分析:更多的是从系统发育和群落构建两个角度出发:
从系统发育角度分析,利用近缘种或者同属下物种的生态地理分布与分子谱系进化关系来探究物种分布的地理驱动力和环境驱动力,以及物种间生态位的差异性是否和谱系进化关系相对应。
从群落构建的角度出发,着重探讨生物多样性的范式在不同regional及以上尺度的差异。这种差异可以通过多物种同时建模,然后构建JSDM来分析群落组建过程中的不同尺度的多样性差异。
9.2.2 生态位进化研究常用R包
9.2.2 生态位进化研究常用R包
## 生态位进化常用R包:
## 单物种建模/批量建模/系统进化发生关系
# 包括:
ENMTOOLS包:https://github.com/danlwarren/ENMTools(##地理空间)
ecospat包:https://www.unil.ch/ecospat/en/home/menuguid/ecospat-resources/tools.html(##生态空间)
humboldt包:https://github.com/jasonleebrown/humboldt
biomod2包:https://github.com/biomodhub/biomod2
## 多物种建模 SSDM / JSDM
# 包括:
## SSDM:https://cran.r-project.org/web/packages/SSDM/vignettes/SSDM.html
## Hmsc:https://github.com/hmsc-r/HMSC
## JSDM:https://cran.r-project.org/web/packages/jSDM/vignettes/jSDM.html
# 联合物种建模及其教学视频:
https://www.youtube.com/watch?v=BbKw2Vp7cHw
9.2.2.1 R:ENMTOOLS包
###关于此包的安装:
##这可能是R历史最难安装的 R包之一,安装过程中会遇到各种报错:
## 解决 devtools::install_github 的 SSL/TLS connection failed 问题
参考 https://blog.csdn.net/Allen_jinjie/article/details/103361386/
library(devtools)
install_github("danlwarren/ENMTools")
library(ENMTools)
## 参考文献:
http://onlinelibrary.wiley.com/doi/10.1111/geb.12455/pdf
9.2.2.1.1 R:ENMTOOLS运行模型
```{r eval =F}
构建enmtools.species objects----
构建环境,内置raster包
library(raster) env <- raster::getData('worldclim', var='bio', res=10) env <- crop(env, extent(-10, 17, 39, 48)) plot(env[[1]]) env <- setMinMax(env) ## 确定极值范围,以助于绘图;
配置单个物种属性:----
monticola <- enmtools.species() ##创建空白物种 names(monticola) monticola.path <- paste(system.file(package="ENMTools"), "/monticola.csv", sep='') monticola$species.name <- "monticola" monticola$presence.points <- read.csv(monticola.path)
下面的单位应该是米,但在实际的地理分析时,需要确定好这个buffer的范围;
monticola$range <-background.raster.buffer(monticola$presence.points, 50000, mask = env) monticola$background.points <- background.points.buffer(points = monticola$presence.points,radius = 20000, n = 1000, mask = env[[1]])
构建模型运行 环境:
其中maxent需要另外加参加,参见dismo包;
monticola.gam < - enmtools.gam(矶,ENV,˚F = PRES 〜聚(BIO1,2)+聚(bio7,2)*聚(bio12,2),test.prop = 0.2) monticola.dm < - enmtools .dm(monticola,env,test.prop = 0.2) monticola.bc < -enmtools.bc(monticola,env,test.prop = 0.2) monticola.mx < -enmtools.maxent(monticola,env,test.prop = 0.2)
查看模型运行后的相应曲线;
monticola.glm $ response.plots
查看建模结果在二维图上的投影;
这个工作的意义在于可以看到两个环境因子的非线性组合和概率分布之间的关系:
visualize.enm(monticola.glm, env, layers = c("bio1", "bio12"), plot.test.data = TRUE)
$background.plot
$suit.plot
###### 9.2.2.1.2 R:ENMTOOLS生态位进化
计测生态位宽度、相关性和重叠性:
计测生态位宽度(从地理空间)
raster.breadth(monticola.glm)
计测两物种生态位重叠:
env.overlap(monticola.glm, cyreni.glm, env, tolerance = .001)
生态位(Niche identity)
原理说明:
生态位等价性测试的原理是将两组物种分布点构建集合,然后在随机分配给每个虚拟物种,对构建的虚拟物种分布点建立enm,计算重叠性 该模型的零假设是物种在环境之中的出现实际上是来自相同基础分布的随机抽样;
此外:
作者还提到,生态位进化只是两个物种的实际分布可能导致偏离零假设的众多原因之一(我伦,2014)。有另外两种对于物种分布起到重要限制因子的因素:生物相互作用和空间自相关;
R实现:
id.glm <- identity.test(species.1 = monticola, species.2 = cyreni, env = env, type = "glm", nreps = 4)
绘制结果图:
id.glm
生态位背景相似性检测:
原理简要说明;
基于生态位等价性 假设,如果结果显著可以反映两物种分布的异域性,但正如沃尔丹所说,生态位进化只是造成两物种分布差异的原因,还有可能是空间自相关和种间相互作用的原因; 基于空间自相关的物种分布随机抽样特征会导致物种在局域环境尺度上在构建的地理环境空间中体现出独特的地理信号特征,但可以通过置换周围环境背景的方式来检验是否存在环境空间自相关。 本检验还可以反映栖息地的可用性,并询问给定种群或物种之间存存在的环境中存在可用环境集合空间中,所观测到的物种或者种群之间的相似性是否明显大于预期?
检验分为两种:对称性检验和非对称性检验:
非对称性检验:X和Y为两物种的研究区域,其中Ox和Oy是物种的实际分布发生点。对Y区域进行随机抽样,构建Obgy作为虚拟物种,然后计算由ENM_X和ENM_Obgy的overlaps;原始overlap为ENM_X和ENM_Y;
对称性检验:X和Y为两物种的研究区域,其中Ox和Oy是物种的实际分布发生点。
对Y区域进行随机抽样,构建Obgy作为虚拟物种;对X区域进行随机抽样,构建Obgx作为虚拟物种;然后计算由ENM_Obgx和ENM_Obgy所构建的overlaps。原始overlap为ENM_X和ENM_Y;
R实现:
不对称分析:
bg.bc.asym <- background.test(species.1 = monticola, species.2 = cyreni, env = env, type = "bc", nreps = 4, test.type = "asymmetric" )
对称 分析:
bg.dm.sym <- background.test(species.1 = monticola, species.2 = cyreni, env = env, type = "dm", nreps = 4, test.type = "symmetric" )
由ecospat tests:
当给env提供超过两维的环境,enmtools会自动进行pca分析提取主轴;
id_test
esp.id <- enmtools.ecospat.id(monticola, cyreni, env[[c("bio1", "bio12")]])
bakground_tests:
esp.bg.sym <- enmtools.ecospat.bg(monticola, cyreni, env[[c("bio1", "bio12")]], test.type = "symmetric")
范围测试(Rangebreak tests)
反映栖息地的相似性和栖息地是否存在某种确定性的间隔(Glor and Warren 2011)
R实现:
rbl.glm <- rangebreak.linear(monticola, cyreni, env, type = "glm", nreps = 4) rbl.glm
分割显著性测试:
rbb.bc <- rangebreak.blob(monticola, cyreni, env, type = "bc", nreps = 4)
多个物种的范围测试:
rbr.glm <- rangebreak.ribbon(monticola, cyreni, ribbon, env, type = "glm", f = pres ~ poly(bio1, 2) + poly(bio12, 2) + poly(bio7, 2), width = 0.5, nreps = 4)
###### 9.2.2.1.3 R:ENMTOOLS系统发育
```{r eval =F}
## 构建进化分支,计算年龄重叠:
## 实际上是在进化谱系图上展示,彼此之间的重叠关系推演:
## 构建进化分支:
# Automate the process of downloading data and removing duds and dupes
## 第一步:加载环境:
hisp.env <- raster::getData('worldclim', var='bio', res=10)
hisp.env <- raster::crop(hisp.env, extent(-75, -65, 16, 21))
hisp.env <- setMinMax(hisp.env)
## 第二步:构建批量下载物种的分布点代码:
## 这个同样也需要先去gbif查看准确的物种名:
species.from.gbif <- function(genus, species, name = NA, env){
# Name it after the species epithet unless told otherwise
if(is.na(name)){
name <- species
}
# Get GBIF data
this.sp <- enmtools.species(presence.points = gbif(genus = genus, species = species)[,c("lon", "lat")],
species.name = name)
# Rename columns, get rid of duds
colnames(this.sp$presence.points) <- c("Longitude", "Latitude")
this.sp$presence.points <- this.sp$presence.points[complete.cases(extract(env, this.sp$presence.points)),]
this.sp$presence.points <- this.sp$presence.points[!duplicated(this.sp$presence.points),]
this.sp$range <- background.raster.buffer(this.sp$presence.points, 50000, mask = hisp.env)
return(this.sp)
}
##获取分支数据:
## 这个构建过程中本质上是构架一个范围和一个分布点,注意需要给分布点更换名称:colnames(this.sp$presence.points) <- c("Longitude", "Latitude")
brevirostris <- species.from.gbif(genus = "Anolis", species = "brevirostris", env = hisp.env)
marron <- species.from.gbif(genus = "Anolis", species = "marron", env = hisp.env)
caudalis <- species.from.gbif(genus = "Anolis", species = "caudalis", env = hisp.env)
websteri <- species.from.gbif(genus = "Anolis", species = "websteri", env = hisp.env)
distichus <- species.from.gbif(genus = "Anolis", species = "distichus", env = hisp.env)
## 构建进化分支:
## env =hisp.anoles,env为一个setMinMax()的文件夹
brev.clade <- enmtools.clade(species = list(brevirostris, marron, caudalis, websteri, distichus), tree = hisp.anoles)
check.clade(brev.clade)
## Age-overlap correlation tests (AOC)
##基于已有表达phy关系,进行环境生态位重叠计算,并进行蒙特卡洛置换检验计算,用于评估数据
##范围重叠计算:
range.aoc <- enmtools.aoc(clade = iberolacerta.clade, nreps = 50, overlap.source = "range")
summary(range.aoc)
## 点重叠计算:
point.aoc <- enmtools.aoc(clade = iberolacerta.clade, nreps = 50, overlap.source = "points")
summary(point.aoc)
## ENM重叠计算:
glm.aoc <- enmtools.aoc(clade = iberolacerta.clade, env = env, nreps = 50, overlap.source = "glm", f = pres ~ poly(bio1, 2) + poly(bio12, 2))
9.2.2.2 R:ecospat包
```{r eval =F}
关于此包的作者的一些答疑:
https://docs.google.com/document/d/1Qo2n0GBbIk3sNOGqAQvi_GRwJlU4vTV3Aqa_JC5rtbo/edit
结果显著性总是0.0198?
正常的,因为当模拟分布和实际分布有较大差异时(实际分布在其模拟分布的柱状图右侧)时,p值会等于0.0198
如何选择背景数据?
选择背景数据需要根据物种的 分布情况和研究对象来定 ,在该网站上作者介绍了合适选择哪种脚本运行的方法:
同域物种分布:
适合分布区紧邻的近缘种或者其他物种组合形式,采用同域物种分布模型脚本(## 在该脚本下,所以同域分布的物种采用同种环境构建) 近缘种分布区间隔较远的物种研究(## 如入侵种研究适合异域分布脚本),采用异域分布脚本计算;该脚本运行时的方法是采用待比较的物种分别按照环境buffer构建环境背景范围;
和worrdan生态位计算重叠的差异原因?
沃尔丹的计算是基于地理环境空间,而布鲁尼的计算则是基于环境空间的核平划。
###### 9.2.2.2.1 R:ecospat包同域分布脚本
```{r eval =F}
## written by Olivier Broennimann. Departement of Ecology and Evolution (DEE).
## 16.01.2019. University of Lausanne. Department of ecology and evolution, Switzerland
t1=Sys.time()
#install.packages("devtools")
#library(devtools)
#install_github(repo="ecospat/ecospat/ecospat",ref="3.1")
library(ecospat)
library(ade4)
library(raster)
###########################
#### data preparation #####
###########################
# create mask for background
BRA<-getData('GADM', country='BRA', level=1)
regions<-c("Rio Grande do Sul","Santa Catarina","Paraná","São Paulo","Rio de Janeiro","Minas Gerais","Espírito Santo","Bahia","Sergipe","Alagoas","Pernambuco","Paraíba","Rio Grande do Norte")
bkg<-aggregate(BRA[BRA$NAME_1%in%regions,])
# get climate data
clim<-getData('worldclim', var='bio', res=10)
# choose variables
var<-c("bio1","bio5","bio7","bio12","bio16","bio18")
clim<-clim[[which(names(clim)%in%var)]]
clim<-mask(crop(clim,bbox(bkg)),bkg)
# load occurence sites for the species (column names should be x,y)
file<-"https://www.unil.ch/ecospat/files/live/sites/ecospat/files/shared/Files_site/sp.txt"
occ<-na.exclude(read.delim(file,h=T,sep="\t"))
# extract climate data from the rasters
clim.bkg<-na.exclude(data.frame(extract(clim,bkg)))
clim.occ<-na.exclude(data.frame(extract(clim,occ[,2:3])))
# species list
sp.list<-levels(occ[,1])
sp.nbocc<-c()
for (i in 1:length(sp.list)){
sp.nbocc<-c(sp.nbocc,length(which(occ[,1] == sp.list[i])))
} #calculate the nb of occurences per species
sp.list<-sp.list[sp.nbocc>4] # remove species with less than 5 occurences
nb.sp<-length(sp.list) #nb of species
################################
#### niche quantifications #####
################################
# calibration of PCA-env
pca.env <-dudi.pca(clim.bkg, center = T, scale = T, scannf = F, nf = 2)
# selection of species to analyze
sp.choice<- c("sp11","sp5","sp2") #CHOOSE THE SET OF SPECIES FOR PAIRWISE ANALYSES
sp.combn<-combn(sp.choice,2)
nsp<-ncol(sp.combn)
# storage matrices
overlap<-matrix(nrow=nsp,ncol=nsp,dimnames = list(sp.choice,sp.choice)) # store overlap values
equivalency<-matrix(nrow=nsp,ncol=nsp,dimnames = list(sp.choice,sp.choice)) #store p-values for equivalency tests
similarity<-matrix(nrow=nsp,ncol=nsp,dimnames = list(sp.choice,sp.choice)) #store p-values for similarity tests
# loop of niche quantification for each combination of species
for(i in 1:ncol(sp.combn)) {
spa<-sp.combn[1,i] #name of species a
spb<-sp.combn[2,i] #name of species b
clim.spa<-clim.occ[occ$Species==spa,] #env data for species a
clim.spb<-clim.occ[occ$Species==spb,] #env data for species b
# PCA scores
scores.bkg<- pca.env$li #scores for global climate
scores.spa<- suprow(pca.env,clim.spa)$lisup #scores for spa
scores.spb<- suprow(pca.env,clim.spb)$lisup #scores for spb
# calculation of occurence density
za<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg,scores.spa,100)
zb<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg,scores.spb,100)
# overlap corrected by availabilty of background conditions
ecospat.niche.overlap(za,zb,cor=F)
# test of niche equivalency and similarity
equ<-ecospat.niche.equivalency.test(za,zb,rep=10) #rep=10 because it's slow, put at least 100 for final analyses
sim<-ecospat.niche.similarity.test(za,zb,rep=100,rand.type = 1)
# both za and zb are randomly shifted in the background (previous versions of ecospat implemented rand.type =2)
#storage of values
overlap[sp.combn[1,i],sp.combn[2,i]]<-ecospat.niche.overlap(za,zb,cor=T)[[1]] #store overlap value
equivalency[sp.combn[1,i],sp.combn[2,i]]<-equ$p.D #store equivalency value
similarity[sp.combn[1,i],sp.combn[2,i]]<-sim$p.D #store similarity 21 value
#plot
ecospat.plot.niche(za,title=spa,name.axis1="PC1",name.axis2="PC2")
ecospat.plot.niche(zb,title=spb,name.axis1="PC1",name.axis2="PC2")
ecospat.plot.contrib(pca.env$co,pca.env$eig)
ecospat.plot.overlap.test(equ,"D","Equivalency")
ecospat.plot.overlap.test(sim,"D","Similarity")
#counter
cat(paste(i))
}
## niche position and breadth
niche<-matrix(nrow=nsp,ncol=4,dimnames = list(sp.choice,c("pos1","breadth1","pos2","breadth2"))) #matrix to store niche caracteristics
for(i in 1:length(sp.choice)) { #for each chosen species
R=100 #dimension of gridding of env space
sp<-sp.choice[i]
clim.sp<-clim.occ[occ$Species==sp,]
scores.sp<- suprow(pca.env,clim.sp)$lisup
z<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg,scores.sp,R) # calculation of occurence density
c<-sample(1:(R*R),100,prob=values(z$z.uncor)) #row index of 1000 random pixel weighted by density along PC1 and PC2
y=(c%/%R)+1;x=c%%R # coordinates of the pixels along CP1 and CP2
CP.sim<-z$x[x] # scores of random pixels on CP1
niche[i,1]<-median(CP.sim) # niche position on CP1
niche[i,2]<-var(CP.sim) # niche breadth on CP1
CP2.sim<-z$y[y] # scores of random pixels on CP2
niche[i,3]<-median(CP2.sim) # niche position on CP2
niche[i,4]<-var(CP2.sim) # niche breadth on CP2
}
t2=Sys.time()
t2-t1 #Time difference of 1.704341 mins
9.2.2.2.2 R:ecospat包异域分布脚本
```{r eval =F}
written by Olivier Broennimann. Departement of Ecology and Evolution (DEE).
16.01.2019. University of Lausanne. Department of ecology and evolution, Switzerland
t1=Sys.time()
install.packages("devtools")
library(devtools)
install_github(repo="ecospat/ecospat/ecospat",ref="3.1")
library(ecospat) library(ade4) library(raster) library(rworldmap)
#
data preparation
#
get climate data for the two area
clim<-getData('worldclim', var='bio', res=10)
choice of variables
var<-c("bio1","bio5","bio7","bio12","bio16","bio18") clim<-clim[[which(names(clim)%in%var)]]
create mask for native area background
data(countriesCoarseLessIslands) #countries from rworldmap ctry = c("AUT","BEL","BGR","HRV","CYP","CZE","DNK","EST","FIN","FRA","DEU","GRC","HUN","IRL", "ITA","LVA","LTU","LUX","MLT","NLD","POL","PRT","ROU","SVK","SVN","ESP","SWE","GBR", "NOR","CHE","BLR","UKR","MDA","MNE","SRB","BIH","ALB","MKD") bkg.eu<-aggregate(countriesCoarseLessIslands[countriesCoarseLessIslands$ADM0_A3%in%ctry,])
create mask for invaded area background
ctry = c("CAN", "USA") bkg.nam<-aggregate(countriesCoarseLessIslands[countriesCoarseLessIslands$ADM0_A3%in%ctry,])
extract climate data from the rasters
clim.bkg.eu<-mask(crop(clim,bbox(bkg.eu)),bkg.eu) clim.bkg.nam<-mask(crop(clim,bbox(bkg.nam)),bkg.nam)
load occurrence sites for the species (column names should be x,y)
file<-"https://www.unil.ch/ecospat/files/live/sites/ecospat/files/shared/Files_site/CStoebe_EU_NA.txt" occ.sp.aggr.dupl<-read.delim(file,h=T,sep="\t")
remove duplicated occurrences
occ.sp.aggr<-occ.sp.aggr.dupl[!duplicated(occ.sp.aggr.dupl),]
remove occurrences closer than a minimum distance to each other (remove aggregation). Setting min.dist=0 will remove no occurrence. If your dataset is in degree, 0.008333 correspond to 1km at the equator.
occ.sp<-ecospat.occ.desaggregation(occ.sp.aggr,min.dist = 1/60*10) occ.eu<-base::subset(occ.sp,region=="EU",select = c(x,y)) occ.nam<-base::subset(occ.sp,region=="NAM",select = c(x,y))
create sp occurrence dataset by extracting climate variables from the rasters
env.bkg.eu<-na.exclude(data.frame(extract(clim,bkg.eu))) env.bkg.nam<-na.exclude(data.frame(extract(clim,bkg.nam))) env.occ.eu<-na.exclude(extract(clim,occ.eu)) env.occ.nam<-na.exclude(extract(clim,occ.nam))
check spatial autocorrelation
ecospat.mantel.correlogram(dfvar=cbind(occ.eu,env.occ.eu),colxy=1:2,n=200,nclass=10,nperm=20) ecospat.mantel.correlogram(dfvar=cbind(occ.nam,env.occ.nam),colxy=1:2,n=200,nclass=10,nperm=20)
#
niche quantifications
#
calibration of PCA-env
pca.env <-dudi.pca(rbind(env.bkg.eu,env.bkg.nam), center = T, scale = T, scannf = F, nf = 2)
predict the scores on the PCA axes
scores.bkg<- pca.env$li scores.bkg.eu<- suprow(pca.env,env.bkg.eu)$lisup scores.bkg.nam<- suprow(pca.env,env.bkg.nam)$lisup scores.occ.eu<- suprow(pca.env,env.occ.eu)$lisup scores.occ.nam<- suprow(pca.env,env.occ.nam)$lisup
calculation of occurence density
z1<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg.eu,scores.occ.eu,R=100) z2<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg.nam,scores.occ.nam,R=100) equ<-ecospat.niche.equivalency.test(z1,z2,rep=10)
test of niche equivalency and similarity
sim1<-ecospat.niche.similarity.test(z1,z2,rep=100,alternative = "greater",rand.type = 1) #niches randomly shifted in both area sim2<-ecospat.niche.similarity.test(z1,z2,rep=100,alternative = "greater",rand.type = 2) #niche randomly shifted only in invaded area
overlap corrected by availabilty of background conditions
ecospat.niche.overlap(z1,z2,cor=T)
uncorrected overlap
ecospat.niche.overlap(z1,z2,cor=F)
#
niche visualizations
#
occurrence density plots
ecospat.plot.niche(z1,title="PCA-env - EU niche",name.axis1="PC1",name.axis2="PC2") ecospat.plot.niche(z2,title="PCA-env - NA niche",name.axis1="PC1",name.axis2="PC2")
contribution of original variables
ecospat.plot.contrib(pca.env$co,pca.env$eig)
niche tests plots
ecospat.plot.overlap.test(equ,"D","Equivalency") ecospat.plot.overlap.test(sim1,"D","Similarity 1<->2") #niches randomly shifted in both areas ecospat.plot.overlap.test(sim2,"D","Similarity 1->2") #niche randomly shifted only in invaded area
ecospat.plot.niche.dyn(z1,z2,quant=0.5,name.axis1="PC1",name.axis2="PC2",interest=1) ecospat.plot.niche.dyn(z1,z2,quant=0.5,name.axis1="PC1",name.axis2="PC2",interest=2)
occurrence density in geography
plot(ecospat.niche.zProjGeo(z1,clim.bkg.eu)) points(occ.eu,cex=0.2,pch=19) plot(ecospat.niche.zProjGeo(z2,clim.bkg.nam)) points(occ.nam,cex=0.2,pch=19)
t2=Sys.time() t2-t1 #Time difference of 1.554178 mins
###### 9.2.2.2.2 R:ecospat包异域分布脚本
```{r eval =F}
## written by Olivier Broennimann. Departement of Ecology and Evolution (DEE).
## 16.01.2019. University of Lausanne. Department of ecology and evolution, Switzerland
t1=Sys.time()
#install.packages("devtools")
#library(devtools)
#install_github(repo="ecospat/ecospat/ecospat",ref="3.1")
library(ecospat)
library(ade4)
library(raster)
library(rworldmap)
###########################
#### data preparation #####
###########################
# get climate data for the two area
clim<-getData('worldclim', var='bio', res=10)
# choice of variables
var<-c("bio1","bio5","bio7","bio12","bio16","bio18")
clim<-clim[[which(names(clim)%in%var)]]
# create mask for native area background
data(countriesCoarseLessIslands) #countries from rworldmap
ctry = c("AUT","BEL","BGR","HRV","CYP","CZE","DNK","EST","FIN","FRA","DEU","GRC","HUN","IRL",
"ITA","LVA","LTU","LUX","MLT","NLD","POL","PRT","ROU","SVK","SVN","ESP","SWE","GBR",
"NOR","CHE","BLR","UKR","MDA","MNE","SRB","BIH","ALB","MKD")
bkg.eu<-aggregate(countriesCoarseLessIslands[countriesCoarseLessIslands$ADM0_A3%in%ctry,])
# create mask for invaded area background
ctry = c("CAN", "USA")
bkg.nam<-aggregate(countriesCoarseLessIslands[countriesCoarseLessIslands$ADM0_A3%in%ctry,])
# extract climate data from the rasters
clim.bkg.eu<-mask(crop(clim,bbox(bkg.eu)),bkg.eu)
clim.bkg.nam<-mask(crop(clim,bbox(bkg.nam)),bkg.nam)
# load occurrence sites for the species (column names should be x,y)
file<-"https://www.unil.ch/ecospat/files/live/sites/ecospat/files/shared/Files_site/CStoebe_EU_NA.txt"
occ.sp.aggr.dupl<-read.delim(file,h=T,sep="\t")
# remove duplicated occurrences
occ.sp.aggr<-occ.sp.aggr.dupl[!duplicated(occ.sp.aggr.dupl),]
# remove occurrences closer than a minimum distance to each other (remove aggregation). Setting min.dist=0 will remove no occurrence. If your dataset is in degree, 0.008333 correspond to 1km at the equator.
occ.sp<-ecospat.occ.desaggregation(occ.sp.aggr,min.dist = 1/60*10)
occ.eu<-base::subset(occ.sp,region=="EU",select = c(x,y))
occ.nam<-base::subset(occ.sp,region=="NAM",select = c(x,y))
# create sp occurrence dataset by extracting climate variables from the rasters
env.bkg.eu<-na.exclude(data.frame(extract(clim,bkg.eu)))
env.bkg.nam<-na.exclude(data.frame(extract(clim,bkg.nam)))
env.occ.eu<-na.exclude(extract(clim,occ.eu))
env.occ.nam<-na.exclude(extract(clim,occ.nam))
#check spatial autocorrelation
ecospat.mantel.correlogram(dfvar=cbind(occ.eu,env.occ.eu),colxy=1:2,n=200,nclass=10,nperm=20)
ecospat.mantel.correlogram(dfvar=cbind(occ.nam,env.occ.nam),colxy=1:2,n=200,nclass=10,nperm=20)
################################
#### niche quantifications #####
################################
#calibration of PCA-env
pca.env <-dudi.pca(rbind(env.bkg.eu,env.bkg.nam), center = T, scale = T, scannf = F, nf = 2)
# predict the scores on the PCA axes
scores.bkg<- pca.env$li
scores.bkg.eu<- suprow(pca.env,env.bkg.eu)$lisup
scores.bkg.nam<- suprow(pca.env,env.bkg.nam)$lisup
scores.occ.eu<- suprow(pca.env,env.occ.eu)$lisup
scores.occ.nam<- suprow(pca.env,env.occ.nam)$lisup
# calculation of occurence density
z1<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg.eu,scores.occ.eu,R=100)
z2<- ecospat.grid.clim.dyn(scores.bkg,scores.bkg.nam,scores.occ.nam,R=100)
equ<-ecospat.niche.equivalency.test(z1,z2,rep=10)
# test of niche equivalency and similarity
sim1<-ecospat.niche.similarity.test(z1,z2,rep=100,alternative = "greater",rand.type = 1) #niches randomly shifted in both area
sim2<-ecospat.niche.similarity.test(z1,z2,rep=100,alternative = "greater",rand.type = 2) #niche randomly shifted only in invaded area
# overlap corrected by availabilty of background conditions
ecospat.niche.overlap(z1,z2,cor=T)
# uncorrected overlap
ecospat.niche.overlap(z1,z2,cor=F)
################################
#### niche visualizations #####
################################
# occurrence density plots
ecospat.plot.niche(z1,title="PCA-env - EU niche",name.axis1="PC1",name.axis2="PC2")
ecospat.plot.niche(z2,title="PCA-env - NA niche",name.axis1="PC1",name.axis2="PC2")
# contribution of original variables
ecospat.plot.contrib(pca.env$co,pca.env$eig)
# niche tests plots
ecospat.plot.overlap.test(equ,"D","Equivalency")
ecospat.plot.overlap.test(sim1,"D","Similarity 1<->2") #niches randomly shifted in both areas
ecospat.plot.overlap.test(sim2,"D","Similarity 1->2") #niche randomly shifted only in invaded area
ecospat.plot.niche.dyn(z1,z2,quant=0.5,name.axis1="PC1",name.axis2="PC2",interest=1)
ecospat.plot.niche.dyn(z1,z2,quant=0.5,name.axis1="PC1",name.axis2="PC2",interest=2)
# occurrence density in geography
plot(ecospat.niche.zProjGeo(z1,clim.bkg.eu))
points(occ.eu,cex=0.2,pch=19)
plot(ecospat.niche.zProjGeo(z2,clim.bkg.nam))
points(occ.nam,cex=0.2,pch=19)
t2=Sys.time()
t2-t1 #Time difference of 1.554178 mins
9.2.2.3 R:ecosapt包群落生态学的研究
## 解释和说明:
## 在ecospat包中对群落生态学的演仍旧遵循了群落生态学研究的基本范式——排序、聚类和空间结构;在栅格单元的尺度上统计栅格单元范围内的种间presence/absence
## R实现:
## 衡量种间互作:是比较同一栅格单元的不同物种同时出现的数量;
## 返回结果是统计每个物种与其他匹配分析的物种在构建栅格集合时,同一栅格中共同出现的概率;然后对概率矩阵进行绘制箱线图来反映每个物种与其他物种的范围重叠性,也在一定程度上间接的反映了种间关联度(## 群落互做的重要指标)
## 另外查到在研究种间关联分析方面,张金龙教授的工作非常出色:早在2010年就开发了spaa包用于群落生态学的样方数据评估;
data <- ecospat.testData[c(9:16,54:57)]
ecospat.co_occurrences (data) ##返回每个物种
## 群落系统发育生态学:构建群落系统发育树,返回每个栅格内的物种多样性的统计值;结果是绘制系统发育多样性与物种丰度之间的关系图:(##不太懂)
pd<- ecospat.calculate.pd(tree, data, method = "spanning", type = "species", root = TRUE, average = FALSE, verbose = TRUE )
## 统计
9.2.2.3 R:ecosapt包特色脚本:
```{r eval =F}
绘制单维度变量的适生概率和实际分布概率值的差异核平滑图在单维层面;
其实这如果需要比较多个物种,或许可以使用箱线图;或者小提琴图;
这个图有用的一点是可以反映建模结果的约束性在单维空间投影上的选择偏好,这种选择偏好,尽管可以通过种群实际分布体现(但实际地理分布往往会存在种群发育阶段的问题),因此在环境尺度上的对物种分布概率建模结果和实际环境信息的比较能够在某种程度上减少对种群发育信号的相应差异。
gridding the native niche
grid.clim.t.nat <- ecospat.grid.clim.dyn(glob=as.data.frame(rbind(nat,inv)[,10]), glob1=as.data.frame(nat[,10]), sp=as.data.frame(nat[which(nat[,11]==1),10]), R=1000, th.sp=0)
gridding the invaded niche
grid.clim.t.inv <- ecospat.grid.clim.dyn(glob=as.data.frame(rbind(nat,inv)[,10]), glob1=as.data.frame(inv[,10]), sp=as.data.frame(inv[which(inv[,11]==1),10]), R=1000, th.sp=0) t.dyn<-ecospat.niche.dyn.index (grid.clim.t.nat, grid.clim.t.inv, intersection=0.1) ecospat.plot.niche.dyn(grid.clim.t.nat, grid.clim.t.inv, quant=0, interest=2, title= "Niche Overlap", name.axis1="Average temperature")
涉及迁移速率研究:
NOT RUN {
ecospat.migclim()
Some example data files can be downloaded from the following web page:
http://www.unil.ch/ecospat/page89413.html
#
Run the example as follows (set the current working directory to the
folder where the example data files are located):
#
library(MigClim) data(MigClim.testData)
Run MigClim with a data frame type input.
n <- MigClim.migrate (iniDist=MigClim.testData[,1:3], hsMap=MigClim.testData[,4:8], rcThreshold=500, envChgSteps=5, dispSteps=5, dispKernel=c(1.0,0.4,0.16,0.06,0.03), barrier=MigClim.testData[,9], barrierType="strong", iniMatAge=1, propaguleProd=c(0.01,0.08,0.5,0.92), lddFreq=0.1, lddMinDist=6, lddMaxDist=15, simulName="MigClimTest", replicateNb=1, overWrite=TRUE, testMode=FALSE, fullOutput=FALSE, keepTempFiles=FALSE)
}
##### 9.2.2.3 R:humboldt包
###### 9.2.2.3.1 R:humboldt原理
- [ ] ```R
## 文献引用:
Brown, J.L. & Carnaval, A.C. (2019) A tale of two niches: methods, concepts and evolution. Frontiers of Biogeography, 11, e44158. doi:10.21425/F5FBG44158
## github代码:
https://github.com/jasonleebrown/humboldt
## 工作文档说明:
大多数物种当代分布处于非平衡状态,因此,其生态位(占用,潜力和可用基础)的地理表现会随时间动态变化。新方法提供了一些定量的进展,这些进展描述了两种物种分布中可及的气候以及非相似和相似气候之间的对应关系。总体而言,新方法提高了小生境相似性量化和相应统计测试的准确性,在正确量化具有已知真相的模拟数据中的小生境等效性和差异方面,始终优于同类测试。
### 特色工作,主要提供两种工作分析:
1、潜在niche截断指数:
大多数关于生态位差异的研究都忽略了以当代分布为特征的被占领生态位的好坏程度。 为了提供了解这种关系的第一步,``洪堡''提供了一种量化物种占据的电子空间被环境中可用电子空间截断的可能性的方法(请参阅Brown&Carnaval 2019)在E空间中被截断的生态位比例越大,被占据的生态位可能无法很好地反映基本生态位的风险就越高。 根据物种的E空间与相邻栖息地中可用空间之间的关系,我们可以评估观察到的E空间被截断的风险,以及我们有多大可能低估了物种的基本生态位(参见Brown&Carnaval 2019) 在这里,我们介绍了一种新的量化方法来衡量这一点:潜在生态位截断指数(PNTI)。 它描述了被可用电子空间截断的被观察物种的电子空间量。 具体来说,它是对物种E空间的5%内核密度等值线与可及的环境E空间的10%内核密度等值线之间重叠的度量。 PNTI是物种等值线中属于环境等值线之外的部分。 此值从物理上衡量物种的E空间的周长与环境E空间的边界邻接,重叠或超出其范围。 如果该值较大,则存在中等风险(PNTI = 0.15-0.3)或高风险(PNTI> 0.3),由于有限的可用E空间驱动的生态位截断,所测得的生态位不能反映物种的基本生态位。
(## 简言之,在多数情况下由于取样涉及的原因,以及物种的分布呈现不均衡状态的原因(动态非平衡状态),此时基于环境空间构建的物种生态位本身是不能全面反映所实际占据的基本生态位特征的,这样会导致无法反映种的本质特征。可检测的方法是在二维环境因子的基础上构建物种分布的主成分分析空间,基于专家意见和建模分析获取该种的背景空间信息,在背景空间信息的基础上构建物种的现实生态位空间,并评估现实生态位所构建的环境空间生态位的被现实环境因子所截断的情况(使用潜在截断指数来评估,评估的数学方法是构建重叠范围评估))
2、生态位重叠和分化测试:
引用文献: (Brown & Carnaval 2009)
该测试可以评估物种之间的差异是否来自真正的生态位发散。新方法提高了利基相似度和相关测试的准确性,始终优于其他测试。我们的方法描述了物种分布中非相似气候和相似气候之间的关系,这是以前无法获得的。通过这些改进,可以评估两个类群所占据的不同环境空间是否来自真正的生态位演化,而不是生活史和生物相互作用因子的差异,还是它们可接近的环境的多样性和配置的差异。
9.2.2.3.2 R:构筑 Humbolt 格式的环境数据
```{r eval=FALSE} 1、将栅格数据导入R library(raster)
load rasters to sample, here Tiff rasters are in base working directory
BIO1 = raster("bio1.tif") BIO2 = raster("bio2.tif") BIO12 = raster("bio12.tif") BIO15= raster("bio15.tif")
2、将范围缩小到分析区域
create raster stack to speed process up
raster_stack_full<-stack(BIO1, BIO2, BIO12, BIO15)
define extent for which you will clip data to.
I highly recommend determining these values from a GIS (ArcGIS or Google Earth).
extent order input below is: xmin, xmax, ymin, yma
e.in <- extent(-160, 10, 30, 60)
perform crop function to clip to input extent
raster_stack <- crop(raster_stack_full, e.in)
3、将输入栅格转换为“humboldt”格式:
convert one of the rasters to a point dataframe to sample. Use any raster input.
env.points<-rasterToPoints(BIO1, fun=NULL, spatial=FALSE)
rarefy points to appropriate analysis resolution. Note it is best to start with climate data that is similar to the desired resolution. Else this process can take a lot of time. If climate is exactly the resolution desired (we recommend 10-40km2 for most studies), this step can be skipped.
env.sampling.res<-humboldt.occ.rarefy(env.points, colxy = 1:2, rarefy.dist = 40, rarefy.units = "km", run.silent.rar = F)
subset only the x and y data
env.sampling.res<- env.sampling.res[,1:2]
Extract values to points from rasters
RAST_VAL<-data.frame(extract(raster_stack, env.sampling.res))
merge sampled data to input
Env1<-cbind(env.sampling.res,RAST_VAL)
save the file as '.csv' for future analyses
write.csv(Env1, file = "Env1.csv")
if necessary, repeat for environment 2
###### 9.2.2.3.3 R:潜在截断指数
```R
## 案例:
library(humboldt)
##load environmental variables for all sites of the study area 1 (env1). Column names should be x,y,X1,X2,...,Xn)
data(env1)
## load environmental variables for all sites of the study area 2 (env2). Column names should be x,y,X1,X2,...,Xn)
data(env2)
## remove NAs and make sure all variables are imported as numbers
env1<-humboldt.scrub.env(env1)
env2<-humboldt.scrub.env(env2)
##load occurrence sites for the species at study area 1 (env1). Column names should be sp,x,y
data(sp1)
##load occurrence sites for the species at study area 2 (env2). Column names should be sp,x,y
data(sp2)
##its highly recommended that you using the function "humboldt.top.env" to select only the important environmental variables in humboldt.doitall. This step can be skipped. If you downloaded tons of environmental data, you should use this step.
reduc.vars<- humboldt.top.env(env1=env1,env2=env2,sp1=sp1,sp2=sp2,rarefy.dist=50, rarefy.units="km", env.reso=0.416669,learning.rt1=0.01,learning.rt2=0.01,e.var=(3:21),pa.ratio=4,steps1=50,steps2=50,method="contrib",contrib.greater=5)
##Adjust the number of variables input for e.vars after reduction to only important variables
num.var.e<-ncol(reduc.vars$env1)
##convert geographic space to espace for measuring pnt.index
zz<-humboldt.g2e(env1=reduc.vars$env1, env2=reduc.vars$env2, sp1=sp1, sp2=sp2, reduce.env = 0, reductype = "PCA", non.analogous.environments = "YES", env.trim= T, e.var=c(3:num.var.e), col.env = e.var, trim.buffer.sp1 = 200, trim.buffer.sp2 = 200, rarefy.dist = 50, rarefy.units="km", env.reso=0.41666669, kern.smooth = 1, R = 100, run.silent = F)
##store espace scores for sp1 and environments 1,2 and both environments combined output from humboldt.g2e
scores.env1<-zz$scores.env1[1:2]
scores.env2<-zz$scores.env2[1:2]
scores.env12<- rbind(zz$scores.env1[1:2],zz$scores.env2[1:2])
scores.sp1<-zz$scores.sp1[1:2]
scores.sp2<-zz$scores.sp2[1:2]
##estimate the Potential Niche Truncation Index
pnt1<- humboldt.pnt.index(scores.env12,scores.env1,scores.sp1,kern.smooth=1,R=100)
pnt2<- humboldt.pnt.index(scores.env12,scores.env2,scores.sp2,kern.smooth=1,R=100)
9.2.2.3.4 R:生态位重叠和分化
```{r eval=FALSE}
案例:
library(humboldt)
load environmental variables for all sites of the study area 1 (env1). Column names should be x,y,X1,X2,...,Xn)
data(env1)
load environmental variables for all sites of the study area 2 (env2). Column names should be x,y,X1,X2,...,Xn)
data(env2)
remove NAs and make sure all variables are imported as numbers
env1<-humboldt.scrub.env(env1) env2<-humboldt.scrub.env(env2)
load occurrence sites for the species at study area 1 (env1). Column names should be sp,x,y
data(sp1)
load occurrence sites for the species at study area 2 (env2). Column names should be sp,x,y
data(sp2)
its highly recommended that you using the function "humboldt.top.env" to select only the important environmental variables in humboldt.doitall.
This step can be skipped. If you downloaded tons of environmental data, you should use this step. If you skip this step, input env1/env2 in place of reduc.vars$env1/reduc.vars$env2
reduc.vars<- humboldt.top.env(env1=env1,env2=env2,sp1=sp1,sp2=sp2,rarefy.dist=50, rarefy.units="km", env.reso=0.416669,learning.rt1=0.01,learning.rt2=0.01,e.var=(3:21),pa.ratio=4,steps1=50,steps2=50,method="contrib",contrib.greater=5)
Adjust the number of variables input for e.vars after reduction to only important variables
num.var.e<-ncol(reduc.vars$env1)
run it first with full environmental for background tests and equivalence statistic (total equivalence or divergence in current distributions)
full<-humboldt.doitall(inname="full_extent", env1=reduc.vars$env1, env2=reduc.vars$env2, sp1=sp1, sp2=sp2, rarefy.dist=50, rarefy.units="km", env.reso=0.416669, reduce.env=0, reductype="PCA", non.analogous.environments="YES", correct.env=T, env.trim=T, env.trim.type="RADIUS", trim.buffer.sp1=500, trim.buffer.sp2=500, pcx=1, pcy=2, col.env=e.var, e.var=c(3:num.var.e), R=100, kern.smooth=1, e.reps=100, b.reps=100, nae="YES",thresh.espace.z=0.001, p.overlap=T, p.boxplot=F, p.scatter=F, run.silent=F, ncores=2)
run it a second time with a trimmed, shared-espace. Here the equivalence statistic tests for niche evolution or niche divergence. For comparing results, change only the following model parameters: reduce.env, non.analogous.environmental, env.trim, nae
shared_ae<-humboldt.doitall(inname="shared_espace_ae", env1=reduc.vars$env1, env2=reduc.vars$env2, sp1=sp1, sp2=sp2, rarefy.dist=50, rarefy.units="km", env.reso=0.416669, reduce.env=2, reductype="PCA", non.analogous.environments="NO", correct.env=T, env.trim=T, env.trim.type="RADIUS", trim.buffer.sp1=500, trim.buffer.sp2=500, pcx=1,pcy=2, col.env=e.var, e.var=c(3:num.var.e), R=100, kern.smooth=1, e.reps=100, b.reps=100, nae="NO",thresh.espace.z=0.001, p.overlap=T, p.boxplot=F, p.scatter=T,run.silent=F, ncores=2)
```{r eval=FALSE}
## Example 2 - typical workflow
library(humboldt)
##load environmental variables for all sites of the study area 1 (env1). Column names should be x,y,X1,X2,...,Xn)
##in this example all input datasets are tab-delimited text files, if using '.csv' files change the parameters below for import steps from 'sep="\t"' to 'sep=","'
env1<-read.delim("env1.txt",h=T,sep="\t")
## load environmental variables for all sites of the study area 2 (env2). Column names should be x,y,X1,X2,...,Xn)
env2<-read.delim("env2.txt",h=T,sep="\t")
## remove NAs and make sure all variables are imported as numbers
env1<-humboldt.scrub.env(env1)
env2<-humboldt.scrub.env(env2)
##load occurrence sites for the species at study area 1 (env1). Column names should be sp,x,y
sp1<-na.exclude(read.delim("sp1.txt",h=T,sep="\t"))
##load occurrence sites for the species at study area 2 (env2). Column names should be sp,x,y
sp2<-na.exclude(read.delim("sp2.txt",h=T,sep="\t"))
##its highly recommended that you using the function "humboldt.top.env" to select only the important environmental variables in humboldt.doitall.
##This step can be skipped. If you downloaded tons of environmental data, you should use this step. If you skip this step, input env1/env2 in place of reduc.vars$env1/reduc.vars$env2
reduc.vars<- humboldt.top.env(env1=env1,env2=env2,sp1=sp1,sp2=sp2,rarefy.dist=50, rarefy.units="km", env.reso=0.416669,learning.rt1=0.01,learning.rt2=0.01,e.var=(3:21),pa.ratio=4,steps1=50,steps2=50,method="contrib",contrib.greater=5)
##Adjust the number of variables input for e.vars after reduction to only important variables
num.var.e<-ncol(reduc.vars$env1)
##run it first with full environmental for background tests and equivalence statistic (total equivalence or divergence in current distributions)
full<-humboldt.doitall(inname="full_extent", env1=reduc.vars$env1, env2=reduc.vars$env2, sp1=sp1, sp2=sp2, rarefy.dist=50, rarefy.units="km", env.reso=0.416669, reduce.env=0, reductype="PCA", non.analogous.environments="YES", correct.env=T, env.trim=T, env.trim.type="RADIUS", trim.buffer.sp1=500, trim.buffer.sp2=500, pcx=1, pcy=2, col.env=e.var, e.var=c(3:num.var.e), R=100, kern.smooth=1, e.reps=100, b.reps=100, nae="YES",thresh.espace.z=0.001, p.overlap=T, p.boxplot=F, p.scatter=F, run.silent=F, ncores=2)
##run it a second time with a trimmed, shared-espace. Here the equivalence statistic tests for niche evolution or niche divergence. For comparing results, change only the following model parameters: reduce.env, non.analogous.environmental, env.trim, nae
shared_ae<-humboldt.doitall(inname="shared_espace_ae", env1=reduc.vars$env1, env2=reduc.vars$env2, sp1=sp1, sp2=sp2, rarefy.dist=50, rarefy.units="km", env.reso=0.416669, reduce.env=2, reductype="PCA", non.analogous.environments="NO", correct.env=T, env.trim=T, env.trim.type="RADIUS", trim.buffer.sp1=500, trim.buffer.sp2=500, pcx=1,pcy=2, col.env=e.var, e.var=c(3:num.var.e), R=100, kern.smooth=1, e.reps=100, b.reps=100, nae="NO",thresh.espace.z=0.001, p.overlap=T, p.boxplot=F, p.scatter=T,run.silent=F, ncores=2)
9.2.2.4 R:biomod2包
9.2.2.5 arcgis:enmtools包
9.2.2.6 nichevol包
nichevol:考虑重建不确定性的物种生态位进化评估
## 该nichevol [R包帮助用户完成关键的步骤,在物种的生态位进化的评估过程中,在重建明确纳入不确定性。这里提出的生态位祖先重建的方法使用结合了估计不确定性的基于bin的方法来表征生态位。与其他现有方法相比,此处介绍的方法减少了高估生态位演化量或速率的风险。主要分析包括:对发生记录和可及区域中的环境数据进行初步探索,为系统发育分析准备数据,进行系统发育比较分析以及为解释作图。
## 安装R包:
install.packages("nichevol")
## git网址:
https://github.com/helixcn/nichevol